Dynamics of gene expression under feedback 
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Gene expression is a stochastic process governed by the presence of specific transcription factors. 
Here we study the dynamics of gene expression in the presence of feedback, where a gene regulates its 
own expression. The nonlinear coupling between input and output of gene expression can generate 
a dynamics different from simple scenarios such as the Poisson process. This is exemplified by our 
findings for the time intervals over which genes are transcriptionally active and inactive. We apply 
our results to the lac system in E. coli, where parametric inference on experimental data results in 
a broad distribution of gene activity intervals. 
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Gene expression is a dynamic process, which transfers 
genetic information from DNA to functional molecules 
such as proteins [.IJ. This process is controlled by spe- 
cific proteins, called transcription factors, which bind to 
DNA typically near the starting point of a gene. Tran- 
scription factors can act as enhancers or as repressors of 
gene transcription by attracting or impeding the molec- 
ular machinery which transcribes a gene. This machin- 
ery, called RNA-polymerase, produces m(essenger)RNA 
molecules from the DNA template. A single mRNA tran- 
script is later translated to several copies of polypeptide 
chains, which fold into proteins. 

Due to the low copy number of the specific molecules 
typically present in a cell, these processes are intrinsi- 
cally stochastic. Thus genes can be thought of as 'tog- 
gling' at random points in time between transcription- 
ally active and inactive states [2]- One manifestation of 
this stochasticity is cell-to-cell variations of mRNA and 
protein numbers in populations of genetically identical 
cells 0. 

The time intervals over which the gene is transcription- 
ally active can be very short. Frequently, only a single 
mRNA molecule is produced before an enhancer molecule 
unbinds again from the regulatory region of a gene, or a 
repressor molecule binds, causing a change in the tran- 
scriptional state of the gene [J]. For short gene-on times, 
individual mRNA molecules are produced in statistically 
independent events, which can be modelled by a Pois- 
son process. As a result, fluctuations in mRNA numbers 
follow Poisson statistics. This picture of gene expres- 
sion dynamics is frequently referred to as the Poisson 
scenario [5|. 

A similarly simple picture emerges if a gene is tran- 
scriptionally active long enough to allow for multiple 
mRNA molecules to be produced. At constant concen- 
tration of transcription factors, binding and unbinding 
of transcription factors to DNA takes place at constant 
rates. Hence the time intervals over which a gene is ac- 
tive or inactive are distributed exponentially. mRNA 
molecules are produced while the gene is active, leading 
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FIG. 1: Feedback in the lac system. This schematic pic- 
ture shows transcription and translation of the lacY gene to 
a protein which facilitates the uptake of lactose (a sugar) and 
its chemical analogues from the environment. Expression of 
this gene is repressed by the transcription factor Lad. Indi- 
vidual sugar molecules {e.g. in the form of allolactose) bind 
to the transcription factor, rendering the transcription fac- 
tor less likely to bind to its binding site on DNA. The sugar 
molecules thus act as effective inducers of lacY expression. 
Lad represses also other genes {lacA and lacZ), which encode 
enzymes used to digest lactose. The feedback loop ensures 
that lac genes are repressed in the absence of inducing sugar 
molecules in the environment. 



to bursts in mRNA numbers (5| . Both exponentially dis- 
tributed gene-on times, and transcriptional bursts have 
recently been observed experimentally [6|, |7[ ■ 

This simple picture of expression dynamics must break 
down in the presence of feedback, which is the subject 
of this paper: Direct or indirect coupling between the 
transcript level of a gene and its transcription rate can 
introduce a nontrivial dynamics. 

An example of feedback is direct autoregulation, which 
is pervasive in bacteria. Feedback is typically non-linear, 
since binding of transcription factors to DNA or to other 
molecules saturates at high concentrations. Feedback can 
play crucial functional roles, for instance in the lac sys- 
tem, which controls the uptake of sugar in bacteria. The 
gene lac Y regulates its expression by inactivating its own 



repressor (a doubly- negative feedback loop, see Fig. [T]). 

The consequences of feedback for cell-to-cell variability 
have been studied both experimentally @] and theoret- 
ically [9I, [lOJ, [ill. The analysis of the dynamics in au- 
toregulatory systems, however, has been limited to lin- 
ear models [12|. In this Letter, we analyze the dynamics 
of regulatory systems with non-linear feedback kinetics. 
The dynamic effects of feedback turn out to be particu- 
larly marked in the case of the nonlinear doubly-negative 
feedback, as in the lac system, where a broad distribution 
of gene-on times emerges. 

We consider a system consisting of a single gene with 
transcriptional state S{t) and number of proteins Y(t) 
present in the cell at time t. The state S = stands 
for a transcriptionally inactive gene, and S = 1 for an 
active gene. The dynamics is assumed to be Markovian. 
To model feedback, both the rate at which the gene goes 
from the inactive to the active state, a(Y), and from 
the active to the inactive state, P{Y), can depend on the 
number of proteins Y. In the gene-on state, the gene 
produces mRNA molecules according to a Poisson pro- 
cess with rate of production 7. Each mRNA molecule is 
translated to a geometrically distributed number of pro- 
13[, with mean number p, before decaying. 
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number of proteins Y decreases at a rate 77, which is 
largely due to dilution by cell division: 
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where the rate 7 is multiplied by S in the second equa- 
tion to take into account that mRNA is produced in the 
active state only. This model assumes a separation of 
time scales between the mRNA and protein dynamics, 
with a fast production of proteins from mRNA, and ne- 
glects any time delays between a change in the protein 
level and its effect on the gene activation and inactiva- 
tion rates. Models that explicitly include the number of 
mRNA molecules lead to very similar results as tested in 
numerical simulations. 

Fig. [2] shows a sample path of the model indicating 
both the number of proteins Y and the transcriptional 
state of the cell. Fig.[2]also shows the result of a piecewise 
deterministic approximation, which assumes a continuous 
deterministic increase of the number of proteins at rate 
jp—r]Y{t) when the gene is on, and an analogous decrease 
at rate r]Y{t) when the gene is off. 

The piecewise deterministic approximation suggests a 
self-contained model, which retains the transitions be- 
tween the transcriptional states as the only source of 
stochasticity. The protein level dynamics Y{t) in this 
approximative model thus consists of deterministic expo- 
nentially increasing and decreasing paths joined together 
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FIG. 2: Sample path of the model. The solid curve shows 
the number of proteins y as a function of time. The number 
of proteins tends to increase when the gene is active (S = 1, 
dark background), and decreases when the gene is inactive 
(5 = 0, light background). The dashed curve shows the piece- 
wise deterministic approximation, see text. The sample path 
includes an instance where the gene is on for a brief time at 
ca. 330 minutes, but no transcription takes place. The model 
parameters were inferred for the lac system, see footnote [22] . 
The lengths of the gene-on intervals can exceed the cell divi- 
sion time (here 216 min) because the protein concentrations 
are inherited [a|. 



at randomly positioned switching times of the transcrip- 
tional state S. The switching times themselves depend on 
the values of Y{t) because of the feedback. Such models 
are known as piecewise deterministic Markov processes 
in the mathematical literature [l^ and, specifically to 
describe feedback, feedback fluid queues l9\. In the con- 
text of gene regulation, such approximations have been 
considered by Kepler and Elston [9|. 

The forward Kolmogorov (master) equation of the 
piecewise deterministic approximation reads 

dtPoiy,t) = -dy[i-7jy)pa{y,t)] (4) 

+P{y)pi{y,t) -a{y)po{y,t) 
dtPi{y,t) = -dy[{jp-riy)pi{y,t)] (5) 

+(^{y)po{y,t) - (3{y)pi{y,t) , 

where Pi (2/, t) denotes the probability of the event {S{t) — 
i,Y{t) — y}. In this approximation, y has an upper 
bound A :— 7p/?7, which is determined by the balance of 
protein production and degradation in the gene-on state. 
Under the biologically reasonable assumption that the 
rate functions a and (3 are bounded away from zero on 
[0, A], the stationary solution to Eqs. ^ and (O reads 
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(A-y)(/)i(y) = #o(y) , 
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where the subindices again refer to the transcriptional 
state. The marginal distribution for the number of pro- 
teins Y is given by the sum of the distributions in Eqs.® 



and dll); 
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The constant Z in Eq. ([5]) normalizes this distribution. 

In the following, we derive the distribution of gene- 
on times in a stationary process, the analysis of gene-off 
times being analogous. We define T as the moment of the 
first gene inactivation after time ip = in a process that 
was started at t = — oo, and consider only those paths 
of the process that have an activation event taking place 
immediately after t^. The probabilities conditional on 
the occurrence of such an event are known in the mathe- 
matical literature as Palm probabilities 1^] and denoted 
by P°. For example, one obtains for the probability that 
the protein level at the time of activation is less than or 
equal to y that [ij] 

P\Y{Q) <y) = Z„-i /' a{y)My) dy , (9) 

Jo 

where Zq = L a{y)(j)Q{y) dy is a normalization constant. 
The probability that the gene is active longer than for a 
given time t is then 

P°{T >t)=E° exp ( - f p{Y{t)) dt\ (10) 



Zq / a(2/)</'o(y)exp 
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(11) 



In Eq. Uni), Y{t) = A-(A-r(0))e-''* is an exponentially 
increasing trajectory with a random initial state Y(0), 
and E'^ is the expectation with respect to P'^. Eq. (fTTj) 
follows from Eq. (fTOl) by a change of variable. 

We evaluate these quantities for a concrete example 
based on the lac system. The activation rate a is taken 
to be independent of K as a first approximation (allowing 
for F-dependent activation rates ,17] does not change the 
general conclusions) , whereas the gene inactivation rate /3 
depends of protein level Y: At high protein levels, the cell 
takes up sugar molecules from the environment at a high 
rate, leadiiig to a high steady state concentration of sugar 
in the cell [^ . This leads to a decreased number of active 
repressors, as discussed in Fig[TJ The gene inactivation 
rate can be written as the product of the binding rate b 
for active repressors and the fraction of active repressors 
A(y), which is of the Michaelis-Menten form pjj 



p{Y) = bXiY) 



1 + {A + BY)^ 



(12) 



Here A and B describe the passive and LacY-dependent 
active uptake of the inducing sugar from the environment 
respectively [l8|- The second power in Eq. ([T^ arises 
because two inducers bound to a repressor are needed to 



prevent the repressor from binding to DNA [l9| . The in- 
activation rate (J12p thus results in a nonlinear regulatory 
feedback. 

Fig.[3]shows the distribution p(t) = ~dP°{T > T)/dT 
of gene-on times in the lac example. The model pa- 
rameters were inferred from experimental data of the 
van Oudenaarden lab [81] as explained in the footnote [22] • 
Both the result of the piecewise deterministic approxi- 
mation (solid line), and numeric simulations of the full 
model ([I])-® (black dots) exhibit a broad distribution 
of gene-on intervals, with an exponential cut-off due to 
saturating inducer concentrations. The small difference 
between the gene-on distributions between the two mod- 
els stems from enhancing upward fluctuations at high 
protein numbers, which are absent in the piecewise de- 
terministic approximation. 
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FIG. 3: Distribution of gene on-times in the lac sys- 
tem. The solid curve gives the probability density function, 
calculated from Eq. (|lip . for the lengths of time intervals 
over which the gene is on under the piecewise deterministic 
assumption. A similarly broad distribution is found in numer- 
ical simulations of the model ([l|-(|3]) (black dots). For com- 
parison, the dashed line gives a mixture of two exponential 
distributions corresponding to a mixture of induced and unin- 
duced cells without feedback. Inset: The histogram shows the 
distribution of lacY levels measured by the van Oudenaarden 
lab 8]. Bin widths were chosen to contain equal number of 
data points. The solid curve gives the corresponding analyt- 
ical result of Eq. (I15|l . and the dashed curve the result (|13[) 
for the exponential model without feedback, with parameters 
fit to the data in both cases. 



These results can be contrasted with a simple model 
lacking feedback, where gene-on times are exponentially 
distributed. We consider a mixed population of cells with 
a fraction r of cells having gene inactivation rate /3(0), 
and a fraction 1 — r having rate /3(A). The dashed line 
in Fig. [3] shows the corresponding gene-on time distribu- 
tion r/3(0)e-^(°)^ + (1 - r-)/3(A)e-''('^)^. The stationary 
protein level distributions of the simple mixture of cells 
reads 

fmixiy) = rffK^o) (y) + (1 - ^)//3(A) {y) (13) 



with 
/e(2/) = i?(a/r;,c/,7)-iAi-^ 



'i^yv ^(A-y)^ \ (14) 
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where B denotes the Euler Beta function. This and the 
corresponding distribution ([8]) for the piecewise deter- 
ministic approximation, 

A(y)' 



z 



-y'^-\/^^yT-\-<^+'''^y' 



[■ctan(A+Bt^) 



f{y) 

(15) 

where k = P{A)/r], can both be accurately fitted to the 
histogram of LacY levels experimentally measured in a 
population of E. coli cells [S], see the inset of Fig. |31 
However, the dynamics of the transcriptional state is 
markedly different in the two models, as is evident from 
the gene-on times in Fig. [3l This shows how dynamic 
information, now within experimental reach [J], can dis- 
tinguish between systems with similar statistics of gene 
expression levels. 

In summary, we have shown how feedback shapes the 
dynamics of regulatory networks. This dynamics also 
characterizes transitions in multistable regulatory sys- 
tems. The hysteretic transition from an uninduced state 
to an induced state of the Zac-system [8[ is accompanied 
by a divergence of time intervals over which the lacY- 
gene is transcriptionally active: At low concentrations 
of the inducing sugar, lacY is active only for short pe- 
riods of time governed by the rate of repressor binding. 
At increasing concentrations, the gene on-time distribu- 
tion broadens (Fig. [3] and Eq. pTj) ). and finally at high 
concentrations of the inducing sugar the gene is tran- 
scriptionally active most of the time. Viewing the tran- 
scriptional state S{t) of the gene as a continuous field 
of discrete spin variables in ID, this behaviour can be 
seen as a divergence of magnetic domain sizes: Feedback 
introduces interactions between spins at different times, 
with a range determined by the protein life-time t?"^, 
which also determines the eventual exponential cut-off in 
the distribution of gene-on times. 

We have focused on a doubly-negative feedback-loop 
based on the lac-system. Positive feedback also gener- 
ates non-trivial dynamics, however, it is the inverse of 
gene-off times which turn out to follow a broad distribu- 
tion. Our analysis is not restricted to systems with direct 
autoregulation. Regulatory networks typically contain 
loops generating correlations between different events af- 
fecting the transcription of a gene. These correlations are 
at the heart of deviations from statistical pictures such as 
the Poisson scenario. An example is feed-forward loops, 
where the signal from one gene is recombined with a 
time-delayed copy of itself to produce a simple filter [20| . 
Feedback loops also play a key role in the control of 
ionic channels determining the excitability of the heart, 
where non-exponential distributions of channel-open and 
-closed times have been observed 
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